Abstract

We show that several versions of Floyd and Rivest's improved algorithm Select for finding the $k$th smallest of $n$ elements require at most $n + \min\{k, n-k\} + O(n^{1/2}\ln^{1/2} n)$ comparisons on average and with high probability. This rectifies the analysis of Floyd and Rivest, and extends it to the case of nondistinct elements. Encouraging computational results on large median-finding problems are reported.

Key words. Selection, medians, computational complexity.

1 Introduction

The selection problem is defined as follows: Given a set $X := \{x_j\}_{j=1}^n$ of $n$ elements, a total order $<$ on $X$, and an integer $1 \le k \le n$, find the $k$th smallest element of $X$, i.e., an element $x$ of $X$ for which there are at most $k-1$ elements $x_j < x$ and at least $k$ elements $x_j \le x$. The median of $X$ is the $\lceil n/2\rceil$th smallest element of $X$.
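The definition above remains meaningful when $X$ contains repeated elements. As a minimal sketch (the helper names `kth_smallest_by_sorting`, `is_kth_smallest` and `median` are illustrative and do not come from the paper), sorting can serve as a reference oracle, and the two counting conditions can be checked directly:

```python
import math


def kth_smallest_by_sorting(xs, k):
    """Reference oracle: return the kth smallest element (k is 1-based)."""
    if not 1 <= k <= len(xs):
        raise ValueError("k must satisfy 1 <= k <= n")
    return sorted(xs)[k - 1]


def is_kth_smallest(xs, k, x):
    """Check the definition: at most k-1 elements < x and at least k elements <= x."""
    strictly_below = sum(1 for y in xs if y < x)
    at_most = sum(1 for y in xs if y <= x)
    return x in xs and strictly_below <= k - 1 and at_most >= k


def median(xs):
    """The ceil(n/2)-th smallest element, i.e. the lower median for even n."""
    return kth_smallest_by_sorting(xs, math.ceil(len(xs) / 2))
```

With nondistinct elements the same value can be the $k$th smallest for several consecutive $k$; for instance, in `[3, 1, 2, 2, 5]` the value 2 is both the 2nd and the 3rd smallest.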

Selection is one of the fundamental problems in computer science; see, e.g., the references in [DHUZ01, DoZ99, DoZ01] and [Knu98, §5.3.3]. Most references concentrate on the number of comparisons between pairs of elements made in selection algorithms. In the worst case, selection needs at least $(2 + \epsilon)n$ comparisons [DoZ01], whereas the algorithm of [BFP+72] makes at most $5.43n$, that of [SPP76] needs $3n + o(n)$, and that in [DoZ99] takes $2.95n + o(n)$. In the average case, for $k \le \lceil n/2\rceil$, at least $n + k - O(1)$ comparisons are necessary [CuM89], whereas Knuth's best upper bound is $n + k + O(n^{1/2}\ln^{1/2} n)$ [Knu98, Eq. (5.3.3.16)]. The classical algorithm Find of [Hoa61], also known as quickselect, has an upper bound of $3.39n + o(n)$ for $k = \lceil n/2\rceil$ in the average case [Knu98, Ex. 5.2.2-32], which improves to $2.75n + o(n)$ for median-of-3 pivots [Gru99, KMP97].

The seminal papers [FlR75a, FlR75b] presented three versions of the algorithm Select
with very good average case performance, although their analysis had gaps, as noted in
[PRKT83] and [Knu98, Ex. 5.3.3-24]. Our recent papers [Kiw03b, Kiw04] rectified the
analysis of [FlR75b, §2.2] and extended it to the case of nondistinct elements. Specifically,
we showed that several versions of Select, close to those in [FlR75b, §2.1] and [FlR75a],
make at most $n + k + O(n^{2/3}\ln^{1/3} n)$ comparisons on average.
This paper concentrates on versions of the improved Select from [FlR75b, §2.3], again
correcting its analysis and extending it to the case of nondistinct elements. We show that
they make at most $n + k + O(n^{1/2}\ln^{1/2} n)$ comparisons on average.

Thus, apparently for the first time, Knuth's best upper bound is attained by an implementable algorithm without restrictive assumptions. Specifically, Knuth's scheme [Knu98, Ex. 5.3.3-24] is not formulated precisely enough to qualify as an algorithm, it requires distinct elements in random order, and its samples are too large for efficient randomization (since generating a random sample of size $\lfloor n/2\rfloor$ takes too much time; cf. §6.3).

We also prove that nonrecursive versions of Select, which employ other linear-time
selection routines for small subproblems, require at most $n + k + O(n^{1/2}\ln^{1/2} n)$ comparisons
with high probability; we couldn't find such results in the literature. When sorting routines
are used, the bound becomes $n + k + O(n^{1/2}\ln^{3/2} n)$.

Since our interest is not merely theoretical, a serious effort was made to implement the
various versions efficiently and to test them in practice. Our tests on the median-finding
examples of [Val00] show that the improved Select is as fast as the ternary version of
[Kiw04], although a bit slower than the quintary version of [Kiw03b]. All these versions
perform very well in terms of the number of comparisons made on large inputs, the average
numbers being about $1.6n$ for $n$ = 1M, and as small as $1.53n$ for $n$ = 16M. Since the lower
bound is $1.5n$, little room for improvement remains. Of course, future work should assess
more fully the relative merits of these versions, but clearly the improved Select may
compete with other methods in both theory and practice.

The paper is organized as follows. A simplified version of Select that ignores some
roundings is introduced in §2, and its basic features are analyzed in §3. The average
performance of Select and its practical rounded versions is studied in §4. High probability
bounds for nonrecursive versions are derived in §5. Finally, our computational results are
reported in §6.

Our notation is fairly standard. $|A|$ denotes the cardinality of a set $A$. In a given
probability space, P is the probability measure, and E is the mean-value operator.



2 The algorithm Select 

We first recall that the standard version of Select proceeds as follows. By solving two 
pivot selection subproblems over a random sample S from X, two elements u and v almost 
sure to be just below and above the kth are found. The remaining elements are compared 
with u and v to derive a reduced selection problem on the elements between u and v 
that is solved recursively. In general, the size of the reduced problem (and hence its cost) 
diminishes when a larger sample is used, but then the cost of pivot selection grows. To 
balance these costs, the standard version employs a relatively small sample. In contrast, the 
improved version uses a much larger "final" sample S, but u and v are selected iteratively 
by using samples from S. More specifically, let $S_1 \subset \cdots \subset S_{\bar l} \subset S_{\bar l+1} = X$ be a nested
series of random samples from $X$. For each sample $S_l$, two pivots $u_l$ and $v_l$ are found such
that $u_l \le x_k^* \le v_l$ with high probability, where $x_k^*$ is the $k$th smallest element of $X$. In particular,
$u_l = x_k^* = v_l$ when $S_l = X$. For $l \le \bar l$, the positions of $u_{l+1}$ and $v_{l+1}$ in $S_{l+1}$ are chosen
so that $u_l \le u_{l+1} \le v_{l+1} \le v_l$ with high probability, and hence $u_l$ and $v_l$ can be used to
bound the search for $u_{l+1}$ and $v_{l+1}$.

For clarity, we first describe Select in detail without some integer round-ups in sample
sizes, etc.; more practical versions are postponed till §4.2.

Algorithm 2.1.

Select($X$, $k$) (Selects the $k$th smallest element of $X$, with $1 \le k \le n := |X|$)

Step 1 (Initiation). If $n = 1$, return $x_1$. Choose parameters $\alpha \in (0, 1/2]$, $s_1 := n^\alpha$, $r > 1$, $\kappa := 1/r$, $\beta > \frac14(1-\kappa)^{-2}$ and $\bar l$ such that $n = r^{2\bar l}s_1$. Set $\theta := k/n$ and $l := 1$.

Step 2 (Initial sample selection). Draw a random sample $S_1$ of size $s_1$ from $X$. Set

$$g_l := \begin{cases} (\beta s_l \ln n)^{1/2} & \text{if } l \le \bar l, \\ 0 & \text{if } l = \bar l + 1, \end{cases} \tag{2.1}$$

$$i_u^l := \max\{\lceil \theta s_l - g_l\rceil, 1\} \quad\text{and}\quad i_v^l := \min\{\lceil \theta s_l + g_l\rceil, s_l\}, \tag{2.2}$$

$u_1 :=$ Select($S_1$, $i_u^1$) and $v_1 :=$ Select($S_1$, $i_v^1$) by using Select recursively.

Step 3 (Sample selection). Draw a random sample $S_{l+1}$ of size $s_{l+1} := r^2 s_l$ from $X$ such that $S_l \subset S_{l+1}$. (Here $s_{l+1} - s_l$ elements of $X \setminus S_l$ are picked randomly.)

Step 4 (Partitioning). By comparing each element $x$ of $S_{l+1} \setminus S_l$ to $u := u_l$ and $v := v_l$, partition $S_{l+1}$ into $L := \{x \in S_{l+1} : x < u\}$, $U := \{x \in S_{l+1} : x = u\}$, $M := \{x \in S_{l+1} : u < x < v\}$, $V := \{x \in S_{l+1} : x = v\}$, $R := \{x \in S_{l+1} : v < x\}$. If $\theta < 1/2$, $x$ is compared to $v$ first, and to $u$ only if $x < v$. If $\theta \ge 1/2$, the order of the comparisons is reversed.

Step 5 (Pivot selection). (a) Set $g_{l+1}$, $i_u^+ := i_u^{l+1}$ and $i_v^+ := i_v^{l+1}$ via (2.1)-(2.2). (Here we wish to find $u_{l+1}$ and $v_{l+1}$ as the $i_u^+$th and $i_v^+$th smallest elements of $S_{l+1}$.)

(b) If $|L| < i_u^+ \le |L \cup U|$, set $u_{l+1} := u$; else if $|L \cup U \cup M| < i_u^+ \le s_{l+1} - |R|$, set $u_{l+1} := v$; else set $u_{l+1} :=$ Select($S_u$, $\hat\imath_u$), where $S_u$ and $\hat\imath_u$ are determined as follows. If $i_u^+ \le |L|$, set $S_u := L$ and $\hat\imath_u := i_u^+$; else if $s_{l+1} - |R| < i_u^+$, set $S_u := R$ and $\hat\imath_u := i_u^+ - s_{l+1} + |R|$; else set $S_u := M$ and $\hat\imath_u := i_u^+ - |L \cup U|$.

(c) Find $v_{l+1}$, and possibly $S_v$ and $\hat\imath_v$, as in (b) with $i_u^+$ replaced by $i_v^+$ and $u_{l+1}$ by $v_{l+1}$.

Step 6 (Loop). If $s_{l+1} = n$, return $u_{l+1}$. Otherwise, increase $l$ by 1 and go to Step 3.
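The steps of Algorithm 2.1 can be sketched in Python as follows. This is only an illustrative sketch under several assumptions that are not in the paper: nested samples are taken as prefixes of a shuffled copy of the input, Step 4 repartitions the whole sample instead of comparing only the new elements (so the comparison counts analysed below are not attained), small inputs are simply sorted, and the default values `r2=4`, `beta=1.0` and `n_cut=32` are arbitrary. The function name `select` and its parameters are hypothetical.

```python
import math
import random


def select(xs, k, alpha=0.5, r2=4, beta=1.0, n_cut=32, rng=random):
    """Return the kth smallest (1-based) element of the list xs."""
    n = len(xs)
    if n <= n_cut:                        # small inputs: sort (assumption of this sketch)
        return sorted(xs)[k - 1]
    theta = k / n
    xs = list(xs)
    rng.shuffle(xs)                       # prefixes xs[:s] act as nested random samples
    s = min(math.ceil(n ** alpha), n - 1)

    def ranks(size):
        """Pivot ranks (2.2); the gap (2.1) is zero once the sample is all of X."""
        if size == n:
            return k, k
        g = math.sqrt(beta * size * math.log(n))
        lo = max(math.ceil(theta * size - g), 1)
        hi = min(math.ceil(theta * size + g), size)
        return lo, hi

    # Step 2: pivots from the initial sample, found recursively.
    i_u, i_v = ranks(s)
    sample = xs[:s]
    u = select(sample, i_u, alpha, r2, beta, n_cut, rng)
    v = select(sample, i_v, alpha, r2, beta, n_cut, rng)
    while True:
        s = min(r2 * s, n)                # Step 3: enlarge the sample
        sample = xs[:s]
        # Step 4: partition around u <= v (recomputed from scratch for simplicity).
        L = [x for x in sample if x < u]
        M = [x for x in sample if u < x < v]
        R = [x for x in sample if v < x]
        n_u = sum(1 for x in sample if x == u)

        def pick(i):
            """Step 5(b)/(c): the ith smallest element of the current sample."""
            if len(L) < i <= len(L) + n_u:
                return u
            if len(L) + n_u + len(M) < i <= s - len(R):
                return v
            if i <= len(L):
                return select(L, i, alpha, r2, beta, n_cut, rng)
            if i > s - len(R):
                return select(R, i - s + len(R), alpha, r2, beta, n_cut, rng)
            return select(M, i - len(L) - n_u, alpha, r2, beta, n_cut, rng)

        i_u, i_v = ranks(s)               # Step 5(a)
        u, v = pick(i_u), pick(i_v)       # both picks use the old pivots
        if s == n:                        # Step 6
            return u
```

The result is exact whatever the samples look like, because each new pivot is the true element of the requested rank in the current sample; the randomness only affects how large the recursive subproblems become.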

A few remarks on the algorithm are in order. 

Remarks 2.2. (a) The correctness and finiteness of Select stem by induction from the following observations. At Step 2, $|S_1| < |X|$. At Step 5, $S_u$ and $\hat\imath_u$ are chosen so that the $i_u^+$th smallest element of $S_{l+1}$ is the $\hat\imath_u$th smallest element of $S_u$, and $|S_u| < s_{l+1}$ (since $u, v \notin S_u$); similarly for $S_v$ and $\hat\imath_v$. The final loop with $l = \bar l$ has $S_{l+1} = X$, $g_{l+1} = 0$ and $i_u^+ = i_v^+ = \theta n = k$, so $u_{l+1} = v_{l+1}$ is the desired element.

(b) After Step 5 the position of each element of $S_{l+1}$ relative to $u_{l+1}$ and $v_{l+1}$ is known. Hence Step 4 need only compare $u$ and $v$ with the elements of $S_{l+1} \setminus S_l$ (e.g., via one of the quintary partitioning schemes of [Kiw03b, §6]).

(c) The following elementary property is needed in §4.1. The maximum number of comparisons taken by Select on any input of size $n$ is finite, for each $n$ (because the recursive calls of Steps 2 and 5 deal with proper subsets of $X$).
3 Preliminary analysis

In this section we analyze general features of sampling used by Select.

3.1 Sampling deviations and expectation bounds

Our analysis hinges on the following bound on the tail of the hypergeometric distribution
established in [Hoe63] and rederived shortly in [Chv79].

Fact 3.1. Let $s$ balls be chosen uniformly at random from a set of $s_+$ balls, of which $\rho$
are red, and $\rho'$ be the random variable representing the number of red balls drawn. Let
$p := \rho/s_+$. Then

$$P[\rho' \ge ps + g] \le e^{-2g^2/s} \quad \forall g \ge 0. \tag{3.1}$$
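As a numerical illustration of Fact 3.1 (not part of the paper), the exact hypergeometric tail can be compared with the bound $e^{-2g^2/s}$ for a small instance. The function names and the instance $s_+ = 200$, $\rho = 80$, $s = 50$ below are arbitrary choices made for this sketch.

```python
import math


def hypergeom_tail(s_plus, rho, s, threshold):
    """Exact P[rho' >= threshold] when s balls are drawn without replacement
    from s_plus balls of which rho are red (threshold is an integer)."""
    total = math.comb(s_plus, s)
    lo = max(threshold, 0)
    hi = min(rho, s)
    favourable = sum(
        math.comb(rho, j) * math.comb(s_plus - rho, s - j)
        for j in range(lo, hi + 1)
    )
    return favourable / total


def tail_bound(s, g):
    """Right-hand side of (3.1)."""
    return math.exp(-2.0 * g * g / s)


if __name__ == "__main__":
    s_plus, rho, s = 200, 80, 50          # p = 0.4, so p * s = 20
    for g in (0, 2, 5, 10):
        print(g, hypergeom_tail(s_plus, rho, s, 20 + g), tail_bound(s, g))
```

Integer thresholds are used so that the comparison $\rho' \ge ps + g$ is not affected by floating-point rounding.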

We shall also need a simple version of the (left) Chebyshev inequality [Kor78, §2.4.2].

Fact 3.2. Let $\eta$ be a nonnegative random variable such that $P[\eta \le \zeta] = 1$ for some
constant $\zeta$. Then $E\eta \le t + \zeta P[\eta > t]$ for all nonnegative real numbers $t$.

3.2 Sample ranks and partitioning efficiency 

In this subsection we analyze in detail a fixed iteration $l$ of Select.

For simpler notation, we drop $l$ from the subscripts and superscripts and replace $l + 1$ by $+$. Thus let $y_1^* \le \ldots \le y_s^*$ and $z_1^* \le \ldots \le z_{s_+}^*$ denote the sorted elements of the samples $S$ and $S_+$, so that $u = y^*_{i_u}$, $v = y^*_{i_v}$, $u_+ = z^*_{i_u^+}$ and $v_+ = z^*_{i_v^+}$, where

$$i_u := \max\{\lceil \theta s - g\rceil, 1\} \quad\text{and}\quad i_v := \min\{\lceil \theta s + g\rceil, s\}, \tag{3.2}$$

$$i_u^+ := \max\{\lceil \theta s_+ - g_+\rceil, 1\} \quad\text{and}\quad i_v^+ := \min\{\lceil \theta s_+ + g_+\rceil, s_+\}. \tag{3.3}$$

This notation facilitates showing that $u \le u_+ \le v_+ \le v$ with high probability. To deduce that the number of elements between $u$ and $v$ is small enough, let

$$j_u := \max\{\lceil \theta s_+ - 2gs_+/s\rceil, 1\} \quad\text{and}\quad j_v := \min\{\lceil \theta s_+ + 2gs_+/s\rceil, s_+\} \tag{3.4}$$

be bounding indices; we shall see that $z^*_{j_u} \le u \le v \le z^*_{j_v}$ with high probability. Our argument is similar to that of [Kiw03b, Lem. 3.3] because $S$ may be regarded as a random sample from $S_+$; the key difference is that $g_+ \ne 0$ in (3.3) if $l < \bar l$, in which case $g$ is replaced by $(1-\kappa)g$ in our probability bounds. To this end, note that, since $\kappa := 1/r = (s/s_+)^{1/2}$, (2.1) yields

$$g - g_+ s/s_+ = \begin{cases} (1-\kappa)g & \text{if } l < \bar l, \\ g & \text{otherwise.} \end{cases} \tag{3.5}$$

Lemma 3.3. (a) $P[u_+ < u] \le e^{-2(1-\kappa)^2 g^2/s}$ if $i_u = \lceil \theta s - g\rceil$.

(b) $P[u < z^*_{j_u}] \le e^{-2g^2/s}$.

(c) $P[v < v_+] \le e^{-2(1-\kappa)^2 g^2/s}$ if $i_v = \lceil \theta s + g\rceil$.

(d) $P[z^*_{j_v} < v] \le e^{-2g^2/s}$.

(e) $i_u \ne \lceil \theta s - g\rceil$ iff $\theta \le g/s$; $i_v \ne \lceil \theta s + g\rceil$ iff $1 < \theta + g/s$.

Proof. (a) If $z^*_{i_u^+} < y^*_{i_u}$, at least $s - i_u + 1$ samples satisfy $y_i \ge z^*_{\bar j+1}$ with $\bar j := \max_{z_j^* = z^*_{i_u^+}} j$. In the setting of Fact 3.1, we have $\rho := s_+ - \bar j$ red elements $z_j \ge z^*_{\bar j+1}$, $ps = s - \bar j s/s_+$ and $\rho' \ge s - i_u + 1$. Since $i_u = \lceil \theta s - g\rceil < \theta s - g + 1$ and $\bar j \ge i_u^+ \ge \theta s_+ - g_+$ by (3.3), we get $s - i_u + 1 - ps > \bar j s/s_+ - \theta s + g \ge g - g_+ s/s_+$; thus $\rho' > ps + (1-\kappa)g$ by (3.5). Hence $P[u_+ < u] \le P[\rho' \ge ps + (1-\kappa)g]$ and (3.1) yields the conclusion.

(b) If $y^*_{i_u} < z^*_{j_u}$, $i_u$ samples are at most $z^*_\rho$, where $\rho := \max_{z_j^* < z^*_{j_u}} j$. Thus we have $\rho$ red elements $z_j \le z^*_\rho$, $ps = \rho s/s_+$ and $\rho' \ge i_u$. Now, $1 \le \rho \le j_u - 1$ implies $2 \le j_u = \lceil \theta s_+ - 2gs_+/s\rceil$ by (3.4) and thus $j_u < \theta s_+ - 2gs_+/s + 1$, so $-\rho s/s_+ > -\theta s + 2g$. Hence $i_u - ps - g \ge \theta s - g - \rho s/s_+ - g > 0$, i.e., $\rho' > ps + g$; invoke (3.1) as before.

(c) and (d): Argue symmetrically to (a) and (b); cf. [Kiw03b, Proof of Lem. 3.3].

(e) Follows immediately from the properties of $\lceil\cdot\rceil$ [Knu97, §1.2.4]. □

We may now estimate the partitioning costs of Step 4. 

Lemma 3.4. Let $c := c_l$ denote the number of comparisons made at Step 4. Then

$$P[c \le \bar c] \ge 1 - e^{-2g^2/s} \quad\text{and}\quad Ec \le \bar c + 2(s_+ - s)e^{-2g^2/s} \quad\text{with} \tag{3.6a}$$

$$\bar c := (1 + \min\{\theta, 1-\theta\})(s_+ - s) + 3gs_+/s. \tag{3.6b}$$

Proof. Consider the event $A := \{c \le \bar c\}$ and its complement $A' := \{c > \bar c\}$. If $u = v$ then $c = s_+ - s \le \bar c$; hence $P[A'] = P[A' \cap \{u < v\}]$, and we may assume $u < v$ below.

First, suppose $\theta < 1/2$. Then $c = s_+ - s + |\{z \in S_+ \setminus S : z < v\}|$, since $s_+ - s$ elements of $S_+ \setminus S$ are compared to $v$ first. In particular, $c \le 2(s_+ - s)$. If $v \le z^*_{j_v}$, then $\{z \in S_+ : z < v\} \subset \{z \in S_+ : z < z^*_{j_v}\}$ gives $|\{z \in S_+ : z < v\}| \le j_v - 1 < \theta s_+ + 2gs_+/s$ by (3.4), whereas $u < v$ implies $|\{z \in S : z < v\}| \ge |\{z \in S : z \le u\}| \ge i_u \ge \theta s - g$ by (3.2), so $|\{z \in S_+ \setminus S : z < v\}| < \theta(s_+ - s) + 2gs_+/s + g$ yields $c < \bar c$ (because $g \le gs_+/s$). Thus $u < v \le z^*_{j_v}$ implies $A$. Therefore, $A' \cap \{u < v\}$ implies $\{z^*_{j_v} < v\} \cap \{u < v\}$, so $P[A' \cap \{u < v\}] \le P[z^*_{j_v} < v] \le e^{-2g^2/s}$ (Lem. 3.3(d)). Hence we have (3.6), since $Ec \le \bar c + 2(s_+ - s)e^{-2g^2/s}$ by Fact 3.2 (with $\eta := c$, $\zeta := 2(s_+ - s)$).

Next, suppose $\theta \ge 1/2$. Now $c = s_+ - s + |\{z \in S_+ \setminus S : u < z\}|$, since $s_+ - s$ elements of $S_+ \setminus S$ are compared to $u$ first. If $z^*_{j_u} \le u$, then $\{z \in S_+ : u < z\} \subset \{z \in S_+ : z^*_{j_u} < z\}$ gives $|\{z \in S_+ : u < z\}| \le s_+ - j_u \le s_+ - \theta s_+ + 2gs_+/s$, whereas $u < v$ implies $|\{z \in S : u < z\}| \ge |\{z \in S : v \le z\}| \ge s - i_v + 1 > s - \theta s - g$, so $|\{z \in S_+ \setminus S : u < z\}| < (1-\theta)(s_+ - s) + 2gs_+/s + g$ yields $c < \bar c$. Thus $A' \cap \{u < v\}$ implies $\{u < z^*_{j_u}\} \cap \{u < v\}$, so $P[A' \cap \{u < v\}] \le P[u < z^*_{j_u}] \le e^{-2g^2/s}$ (Lem. 3.3(b)), and we get (3.6) as before. □
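The comparison count of Step 4 and the bound $\bar c$ of (3.6b) can be illustrated by a small simulation. The sketch below is not from the paper: it assumes distinct elements, treats the integers $0, \ldots, s_+ - 1$ as the sample $S_+$, draws $S$ from it at random, and counts comparisons according to the rule of Step 4; the function name `step4_cost` and all parameter values in the test are illustrative.

```python
import math
import random


def step4_cost(s_plus, s, theta, beta, n, rng):
    """Return (c, c_bar): the number of Step 4 comparisons for one random
    sample S of size s drawn from S_+ = {0, ..., s_plus - 1}, and the bound (3.6b)."""
    population = list(range(s_plus))
    sample = sorted(rng.sample(population, s))
    in_sample = set(sample)
    g = math.sqrt(beta * s * math.log(n))              # gap (2.1)
    i_u = max(math.ceil(theta * s - g), 1)             # ranks (3.2)
    i_v = min(math.ceil(theta * s + g), s)
    u, v = sample[i_u - 1], sample[i_v - 1]
    rest = [z for z in population if z not in in_sample]
    if u == v:
        c = len(rest)                                   # one comparison per element
    elif theta < 0.5:
        # compare with v first, then with u only if z < v
        c = len(rest) + sum(1 for z in rest if z < v)
    else:
        # compare with u first, then with v only if u < z
        c = len(rest) + sum(1 for z in rest if u < z)
    c_bar = (1 + min(theta, 1 - theta)) * (s_plus - s) + 3 * g * s_plus / s
    return c, c_bar
```

For example, with $s_+ = n = 10000$, $s = 400$, $\theta = 0.3$ and $\beta = 1$, the observed count should fall well below $\bar c$, in line with the failure probability $e^{-2g^2/s} = n^{-2\beta}$ of (3.6a).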

The following result will imply that the sets $S_u$ and $S_v$ selected at Step 5 are "small enough" with high probability. Let $\hat s := \hat s_l := |S_u \cup S_v|$; we let $S_u := \emptyset$ (or $S_v := \emptyset$) if Step 5 doesn't use $S_u$ (or $S_v$), but we don't consider this case explicitly.

Lemma 3.5. $P[\hat s < 4gs_+/s] \ge 1 - P_{\rm fail}$ and $\hat s < s_+$ always, where

$$P_{\rm fail} := P_{\rm fail}(n) := 2e^{-2g^2/s} + 2e^{-2(1-\kappa)^2 g^2/s} = 2n^{-2\beta} + 2n^{-2(1-\kappa)^2\beta} \le 4n^{-2(1-\kappa)^2\beta}. \tag{3.7}$$

Proof. First, consider the middle case of $i_u = \lceil \theta s - g\rceil$ and $i_v = \lceil \theta s + g\rceil$. Let $\mathcal{E}$ denote the event $z^*_{j_u} \le u \le u_+ \le v_+ \le v \le z^*_{j_v}$. By Lem. 3.3 and the Boole-Bonferroni inequality, its complement $\mathcal{E}'$ has $P[\mathcal{E}'] \le P_{\rm fail}$, so $P[\mathcal{E}] \ge 1 - P_{\rm fail}$. By the rules of Steps 4-5, $u \le u_+ \le v_+ \le v$ implies $S_u \cup S_v \subset M$, whereas $z^*_{j_u} \le u \le v \le z^*_{j_v}$ yields $\hat s \le j_v - j_u + 1 - 2$; since $j_v < \theta s_+ + 2gs_+/s + 1$ and $j_u \ge \theta s_+ - 2gs_+/s$ by (3.4), we get $\hat s < 4gs_+/s$. Hence $P[\hat s < 4gs_+/s] \ge P[\mathcal{E}]$. Then (3.7) follows from (2.1) and the fact $\kappa \in (0, 1)$.

Next, consider the left case of $i_u \ne \lceil \theta s - g\rceil$, i.e., $\theta \le g/s$ (Lem. 3.3(e)). If $i_v \ne \lceil \theta s + g\rceil$, then $1 < \theta + g/s$ (Lem. 3.3(e)) gives $\hat s < s_+ < 2gs_+/s$. For $i_v = \lceil \theta s + g\rceil$, $P[v_+ \le v \le z^*_{j_v}] \ge 1 - \frac12 P_{\rm fail}$ by Lem. 3.3(c,d). Now, $v_+ \le v$ implies $S_u \cup S_v \subset L \cup M$, whereas $v \le z^*_{j_v}$ gives $\hat s \le j_v - 1 < \theta s_+ + 2gs_+/s \le 3gs_+/s$; hence $P[\hat s < 4gs_+/s] \ge P[v_+ \le v \le z^*_{j_v}]$.

Finally, consider the right case of $i_v \ne \lceil \theta s + g\rceil$, i.e., $1 < \theta + g/s$. If $i_u \ne \lceil \theta s - g\rceil$ then $\theta \le g/s$ gives $\hat s < s_+ < 2gs_+/s$. For $i_u = \lceil \theta s - g\rceil$, we have $P[z^*_{j_u} \le u \le u_+] \ge 1 - \frac12 P_{\rm fail}$ by Lem. 3.3(a,b). Now, $u \le u_+$ implies $S_u \cup S_v \subset M \cup R$, whereas $z^*_{j_u} \le u$ yields $\hat s \le s_+ - j_u$ with $j_u \ge \theta s_+ - 2gs_+/s$ and thus $\hat s < 3gs_+/s$, so $P[\hat s < 4gs_+/s] \ge P[z^*_{j_u} \le u \le u_+]$. □

Corollary 3.6. $P[c \le \bar c \text{ and } \hat s < 4gs_+/s] \ge 1 - P_{\rm fail}$.

Proof. If $2g/s \ge 1$ then $c \le 2(s_+ - s) \le \bar c$ (cf. (3.6b)) and $\hat s < s_+ \le 4gs_+/s$, so assume $2g/s < 1$. The conclusion follows from the proofs of Lems. 3.4 and 3.5. We only note that the left case of $\theta \le g/s$ now has $i_v = \lceil \theta s + g\rceil$ and $\theta < 1/2$. Similarly, in the right case of $1 < \theta + g/s$, we have $i_u = \lceil \theta s - g\rceil$ and $\theta > 1/2$, since $g/s < 1/2$. □
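To get a feeling for the size of the failure probability in (3.7), the following sketch evaluates $P_{\rm fail}(n)$ and its upper bound $4n^{-2(1-\kappa)^2\beta}$. The function name `p_fail` is illustrative, and the values $\beta = 0.3$, $r = 12$ are borrowed from the experimental setup of §6.1 purely as an example (there they are used with the smaller gaps of §4.3).

```python
import math


def p_fail(n, beta, r):
    """Return (P_fail(n), upper bound) as given in (3.7), with kappa = 1/r."""
    kappa = 1.0 / r
    weak_exponent = 2.0 * (1.0 - kappa) ** 2 * beta
    exact = 2.0 * n ** (-2.0 * beta) + 2.0 * n ** (-weak_exponent)
    bound = 4.0 * n ** (-weak_exponent)
    return exact, bound


if __name__ == "__main__":
    for n in (10**4, 10**6, 10**8):
        exact, bound = p_fail(n, beta=0.3, r=12)
        print(n, exact, bound)
```

Since $(1-\kappa)^2 < 1$, the second term of $P_{\rm fail}$ dominates for large $n$, which is why the bound keeps only the exponent $2(1-\kappa)^2\beta$.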

Remark 3.7. Suppose for $l < \bar l$, Step 5 resets $i_u^+ := i_v^+$ if $\theta \le g_{l+1}/s_{l+1}$, or $i_v^+ := i_u^+$ if $1 < \theta + g_{l+1}/s_{l+1}$, finding a single pivot $u_+ = v_+$ in these cases. The preceding results remain valid for this modification (which corresponds to using $u := v$ if $\theta \le g/s$, or $v := u$ if $1 < \theta + g/s$). Similarly, Step 2 may reset $i_u^1 := i_v^1$ if $\theta \le g_1/s_1$, or $i_v^1 := i_u^1$ if $1 < \theta + g_1/s_1$.

4 Average performance of the recursive version 
4.1 Analysis of the nonrounded version 

In this section we analyze the average performance of Select, starting with the "nonrounded" version of Algorithm 2.1; more practical versions are discussed in §4.2.

Theorem 4.1. Let $C_{nk}$ denote the expected number of comparisons made by Select, and $f(t) := (t\ln t)^{1/2}$ for $t > 1$. There exists a positive constant $\gamma$ such that

$$C_{nk} \le n + \min\{k, n-k\} + \gamma f(n) \quad\text{for all } 1 \le k \le n. \tag{4.1}$$

Proof. We need a few preliminary facts. The function $\phi(t) := f(t)/t = (\ln t/t)^{1/2}$ decreases to 0 on $[e, \infty)$, whereas $f(t)$ grows to infinity on $[2, \infty)$. The key bounding property is $f(\hat t) = \phi(\hat t)\hat t \le \phi(t)\hat t$ for all $\hat t \ge t \ge e$. Pick $\bar n \ge 2$ large enough so that $s_1 \ge e$, $4r^2 g_1 \ge e$, $n^\alpha + 1 \le f(n)$ and $n \ge r^2 s_1$ for all $n \ge \bar n$. Using $\alpha \in (0, 1/2]$ and the bounding property, we have

$$s_1 \le f(n) \quad\text{and}\quad f(s_1) \le \phi(s_1) f(n). \tag{4.2}$$
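By (3.7) and our assumption $\beta > \frac14(1-\kappa)^{-2}$, we have $nP_{\rm fail}(n) = o(f(n))$; more precisely,

$$nP_{\rm fail}(n) \le 4n^{1-2(1-\kappa)^2\beta} = 4f(n)\, n^{1/2-2(1-\kappa)^2\beta} \ln^{-1/2} n. \tag{4.3}$$

Using the monotonicity of $\phi$, we may increase $\bar n$ if necessary to get for all $n \ge \bar n$

$$2\phi(s_1) + 8\,\frac{2r^2 - r}{r - 1}\,\beta^{1/2}\phi(4r^2 g_1) + 8\,\frac{2r^2 - 1}{r^2 - 1}\,\phi(r^2 s_1)\, n^{1/2-2(1-\kappa)^2\beta}\ln^{-1/2} n \le 0.95, \tag{4.4}$$

since each term above goes to 0 as $n$ increases to $\infty$. By Rem. 2.2(c), there is $\gamma$ such that (4.1) holds for all $n \le \bar n$; increasing $\gamma$ if necessary, we have for all $n \ge \bar n$

$$3 + 15\,\frac{2r^2 - r}{r - 1}\,\beta^{1/2} + 4\,\frac{6.5r^2 - 3.5}{r^2 - 1}\, n^{1/2-2(1-\kappa)^2\beta}\ln^{-1/2} n \le 0.05\gamma. \tag{4.5}$$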



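Let $n' \ge \bar n$. Assuming (4.1) holds for all $n \le n'$, for induction let $n = n' + 1$. Since $s_1 < n$, by our hypothesis the cost of selecting $u_1$ and $v_1$ at Step 2 is at most

$$C_{s_1 i_u^1} + C_{s_1 i_v^1} \le 3s_1 + 2\gamma f(s_1). \tag{4.6}$$

Similarly, the cost of selecting $u_{l+1}$ and $v_{l+1}$ at Step 5 is at most $3\hat s_l + 2\gamma f(\hat s_l)$, where $\hat s_l < s_{l+1}$ and $P[\hat s_l \ge 4g_l s_{l+1}/s_l] \le P_{\rm fail}$ by Lem. 3.5. Hence (cf. Fact 3.2 with $\eta := 3\hat s_l + 2\gamma f(\hat s_l)$)

$$E[3\hat s_l + 2\gamma f(\hat s_l)] \le 12g_l s_{l+1}/s_l + 2\gamma f(4g_l s_{l+1}/s_l) + [3s_{l+1} + 2\gamma f(s_{l+1})]P_{\rm fail}, \quad l = 1\colon\bar l. \tag{4.7}$$

For $\bar\theta := \min\{\theta, 1-\theta\}$, the partitioning cost of Step 4 is estimated by (3.6) as

$$Ec_l \le (1 + \bar\theta)(s_{l+1} - s_l) + 3g_l s_{l+1}/s_l + \tfrac12(s_{l+1} - s_l)P_{\rm fail}, \quad l = 1\colon\bar l. \tag{4.8}$$

Adding the costs (4.6)-(4.8) and using $s_{\bar l+1} = n$, we get

$$C_{nk} \le (1 + \bar\theta)(n - s_1) + \Big[\, 3s_1 + 15\sum_{l=1}^{\bar l} g_l s_{l+1}/s_l + \tfrac12 P_{\rm fail}(n - s_1) + 3P_{\rm fail}\sum_{l=1}^{\bar l} s_{l+1} \Big] \tag{4.9a}$$

$$\qquad + 2\gamma\Big[\, f(s_1) + \sum_{l=1}^{\bar l} f(4g_l s_{l+1}/s_l) + P_{\rm fail}\sum_{l=1}^{\bar l} f(s_{l+1}) \Big]. \tag{4.9b}$$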



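Since $\theta := k/n$, the first term on the right side above is at most $n + \min\{k, n-k\}$. Next, for $d := (\beta\ln n)^{1/2}$, (2.1) yields $g_l s_{l+1}/s_l = d s_{l+1}/s_l^{1/2}$ for $l \le \bar l$. Since $s_l = r^{2(l-1)}s_1$ for $l \le \bar l$, and $n > r^{2(\bar l-1)}s_1$ implies $r^{\bar l-1} < (n/s_1)^{1/2}$, we obtain

$$\sum_{l=1}^{\bar l-1} g_l s_{l+1}/s_l = \sum_{l=1}^{\bar l-1} d r^{l+1} s_1^{1/2} = d r^2 s_1^{1/2}\,\frac{r^{\bar l-1} - 1}{r - 1} < d r^2\,\frac{n^{1/2} - s_1^{1/2}}{r - 1}.$$

But $g_{\bar l}s_{\bar l+1}/s_{\bar l} = dn/s_{\bar l}^{1/2} = \beta^{1/2}f(n)(n/s_{\bar l})^{1/2}$ and $n \le r^2 s_{\bar l}$ imply $g_{\bar l}s_{\bar l+1}/s_{\bar l} \le \beta^{1/2}f(n)r$, so

$$\sum_{l=1}^{\bar l} g_l s_{l+1}/s_l \le \beta^{1/2}f(n)\Big(\frac{r^2}{r - 1} + r\Big) = \beta^{1/2}f(n)\,\frac{2r^2 - r}{r - 1}. \tag{4.10}$$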



Similarly, using $s_{l+1} = r^{2l}s_1$ for $l < \bar l$, $s_{\bar l+1} = n$ and $r^{2(\bar l-1)} < n/s_1$, we get

$$\sum_{l=1}^{\bar l} s_{l+1} = s_1\sum_{l=1}^{\bar l-1} r^{2l} + n = r^2 s_1\,\frac{r^{2(\bar l-1)} - 1}{r^2 - 1} + n < r^2\,\frac{n - s_1}{r^2 - 1} + n < \frac{2r^2 - 1}{r^2 - 1}\,n. \tag{4.11}$$

Plugging (4.2), (4.3), (4.10) and (4.11) into (4.9a), we see that the bracketed term is at most $0.05\gamma f(n)$ thanks to (4.5). Next, for $l < \bar l$ we have $4g_l s_{l+1}/s_l \ge 4r^2 g_1$ (cf. (2.1)), whereas $g_{\bar l}s_{\bar l+1}/s_{\bar l} \le \beta^{1/2}f(n)r$ with $4\beta^{1/2}f(n)r \ge 4r^2 g_1$ from $n \ge r^2 s_1$; therefore, we may use the bounding property and argue as for (4.10) to get

$$\sum_{l=1}^{\bar l} f(4g_l s_{l+1}/s_l) \le \phi(4r^2 g_1)\,4\Big(\sum_{l=1}^{\bar l-1} g_l s_{l+1}/s_l + \beta^{1/2}f(n)r\Big) \le 4\,\frac{2r^2 - r}{r - 1}\,\beta^{1/2}\phi(4r^2 g_1) f(n). \tag{4.12}$$

Similarly, $s_{l+1} = r^{2l}s_1 \ge r^2 s_1$ for $l < \bar l$ and $s_{\bar l+1} = n \ge r^2 s_1$ together with (4.11) imply

$$\sum_{l=1}^{\bar l} f(s_{l+1}) \le \sum_{l=1}^{\bar l} \phi(r^2 s_1)s_{l+1} < \frac{2r^2 - 1}{r^2 - 1}\,\phi(r^2 s_1)\,n. \tag{4.13}$$

Now, plugging (4.2), (4.12) and (4.13) combined with (4.3) into (4.9b), we deduce that (4.9b) is at most $0.95\gamma f(n)$ due to (4.4); thus (4.1) holds as required. □



4.2 Analysis of rounded versions 

We now consider more realistic parameter choices for Select. 

Fixing $\alpha \in (0, 1/2]$, $r > 1$ such that $r^2$ is integer, $\kappa := 1/r$, $\beta > \frac14(1-\kappa)^{-2}$, suppose Steps 1 and 3 set

$$s_1 := \min\{\lceil n^\alpha\rceil, n - 1\}, \tag{4.14}$$

$$\bar l := \min\{\, l : r^{2l}s_1 \ge n \,\} = \lceil \ln(n/s_1)/\ln r^2\rceil, \tag{4.15}$$

$$s_{l+1} := \min\{\, r^{2l}s_1, n \,\} = \min\{\, r^2 s_l, n \,\}. \tag{4.16}$$

Note that (4.14)-(4.16) yield $s_{l+1} = r^{2l}s_1$ if $l < \bar l$, $s_{\bar l+1} = n > r^{2(\bar l-1)}s_1$. It is easy to see that the proof of Theorem 4.1 covers this modification.

The final iteration $\bar l$ doesn't need sampling, since $S_{\bar l+1} = X$. Hence, to reduce the sampling costs, we may wish to ensure that $s_{\bar l}$, the number of sampled elements, is at most a fixed fraction $\bar\eta \in (1/r^2, 1]$ of $n$ when $n$ is large. To this end, suppose that for

$$n \ge \max\{\, [r^2/(\bar\eta r^2 - 1)]^{1/\alpha}, 3 \,\} \quad\text{with}\quad \bar\eta \in (1/r^2, 1], \tag{4.17}$$

we replace (4.14)-(4.15) by

$$\bar l := \min\{\, l : r^{2l}n^\alpha \ge n \,\} = \lceil (1-\alpha)\ln n/\ln r^2\rceil, \tag{4.18}$$

$$s_1 := \lceil n/r^{2\bar l}\rceil. \tag{4.19}$$

Then $n^\alpha/r^2 < s_1 \le \lceil n^\alpha\rceil < n$ replaces (4.14), (4.15) remains true and

$$s_{\bar l} \le \eta n \quad\text{with}\quad \eta := r^{-2} + n^{-\alpha} \le \bar\eta. \tag{4.20}$$

Indeed, $n \le r^{2\bar l}n^\alpha < r^2 n$ implies $n^\alpha/r^2 < s_1 \le \lceil n^\alpha\rceil$; since $n^\alpha \le n^{1/2} \le n - 1$ for $n \ge 3$, we have $\lceil n^\alpha\rceil < n$. Next, $n/r^{2\bar l} > n^\alpha/r^2 = 1/(\eta r^2 - 1)$ yields $\eta n/r^{2(\bar l-1)} > n/r^{2\bar l} + 1 > s_1$; thus $\eta n > r^{2(\bar l-1)}s_1$. But $n^\alpha \ge r^2/(\bar\eta r^2 - 1)$ implies $\eta \le \bar\eta \le 1$, so $r^{2(\bar l-1)}s_1 < n$, (4.15) holds and (4.16) gives $s_{\bar l} \le \eta n$. In effect, Theorem 4.1 holds for this modification.
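The rounded sample sizes can be computed directly. The following sketch assumes the variant in which the number of iterations and the initial sample size are derived from $n^\alpha$ and $r^2$ first (the $\bar l := \min\{l : r^{2l}n^\alpha \ge n\}$, $s_1 := \lceil n/r^{2\bar l}\rceil$ rule), followed by $s_{l+1} := \min\{r^2 s_l, n\}$; the function name `sample_schedule` and the argument `r2` (standing for the integer $r^2$) are illustrative, not from the paper.

```python
def sample_schedule(n, alpha=0.5, r2=144):
    """Return the list [s_1, ..., s_lbar, n] of nested sample sizes.

    r2 is the integer r**2; lbar is the smallest l with r2**l * n**alpha >= n,
    and s_1 is n / r2**lbar rounded up.
    """
    lbar = 0
    while r2 ** lbar * n ** alpha < n:
        lbar += 1
    s1 = -(-n // r2 ** lbar)                  # integer ceiling of n / r2**lbar
    sizes = [s1]
    for _ in range(lbar):
        sizes.append(min(r2 * sizes[-1], n))  # s_{l+1} := min{r^2 s_l, n}
    return sizes


if __name__ == "__main__":
    for n in (10**6, 16 * 10**6):
        sizes = sample_schedule(n)
        # the last sampled size s_lbar is sizes[-2]; compare it with n / r^2
        print(n, sizes, sizes[-2] / n)
```

With $r = 12$ and $\alpha = 0.5$, as used in §6.1, the last sampled size is a little above $n/r^2$, which matches the fraction of about 0.69% sampled elements reported in §6.3 up to the extra samples drawn in recursive calls.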

4.3 Using smaller rank gaps 

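Although the gaps $g_l$ of (2.1) give useful high probability bounds (cf. §5), in practice the average performance on small problems improves for the smaller gaps

$$g_l := (\beta s_l \ln s_l)^{1/2} \quad\text{for } l \le \bar l. \tag{4.21}$$

Assuming $\beta > \frac14(1-\kappa)^{-2}$, we now sketch briefly how to extend the previous results. First, $\psi(s) := [1 - \kappa(1 + \ln r^2/\ln s)^{1/2}]^2$ replaces $(1-\kappa)^2$ in the relations of §3.2, and (3.7) becomes

$$P_{\rm fail} := P_{\rm fail}(s) := 2e^{-2g^2/s} + 2e^{-2\psi(s)g^2/s} = 2s^{-2\beta} + 2s^{-2\beta\psi(s)} \le 4s^{-2\beta\psi(s)}. \tag{4.22}$$

For $\bar n$ such that $2\beta\psi(s_1) \ge 1/2$ for all $n \ge \bar n$, (4.3)-(4.5) now involve $P_{\rm fail}(s_l) \le 4s_l^{-1/2}$, so (4.9) is modified accordingly, whereas (4.11) and (4.13) are replaced by

$$\sum_{l=1}^{\bar l} s_{l+1}P_{\rm fail}(s_l)/4 \le \sum_{l=1}^{\bar l-1} r^2 s_l^{1/2} + n^{1/2}r \le r^2\,\frac{n^{1/2} - s_1^{1/2}}{r - 1} + n^{1/2}r < \frac{2r^2 - r}{r - 1}\,n^{1/2}, \tag{4.23}$$

$$\sum_{l=1}^{\bar l} f(s_{l+1})P_{\rm fail}(s_l) \le \sum_{l=1}^{\bar l} \phi(r^2 s_1)s_{l+1}P_{\rm fail}(s_l) \le 4\,\frac{2r^2 - r}{r - 1}\,\phi(r^2 s_1)\,n^{1/2}. \tag{4.24}$$

Modify the third terms of (4.4)-(4.5) to complete the proof of Theorem 4.1 as before.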

4.4 Handling small subfiles 

Since the sampling efficiency decreases when $X$ shrinks, consider the following modification. For a fixed cut-off parameter $n_{\rm cut} \ge 1$, let sSelect($X$, $k$) be a "small-select" routine that finds the $k$th smallest element of $X$ in at most $C_{\rm cut} < \infty$ comparisons when $|X| \le n_{\rm cut}$ (even bubble sort will do). Then Select is modified to start with the following

Step 0 (Small file case). If $n := |X| \le n_{\rm cut}$, return sSelect($X$, $k$).

Our preceding results remain valid for this modification. In fact it suffices if $C_{\rm cut}$ bounds the expected number of comparisons of sSelect($X$, $k$) for $n \le n_{\rm cut}$. For instance, (4.1) holds for $n \le n_{\rm cut}$ and $\gamma \ge C_{\rm cut}$, and by induction as in Rem. 2.2(c) we have $C_{nk} < \infty$ for all $n$, which suffices for the proof of Theorem 4.1.
5 Analysis of nonrecursive versions

Consider a nonrecursive version of Select in which Steps 2 and 5, instead of Select, employ a linear-time routine (e.g., Pick [BFP+72]) that finds the $i$th smallest of $m$ elements in at most $\gamma_P m$ comparisons for some constant $\gamma_P > 2$.

Theorem 5.1. Let $c_{nk}$ denote the number of comparisons made by the nonrecursive version of Select, using (4.14)-(4.16). Then for $n \ge 6$, we have

$$P[\, c_{nk} \le n + \min\{k, n-k\} + \hat\gamma_P f(n) \,] \ge 1 - \bar l P_{\rm fail} \quad\text{with} \tag{5.1a}$$

$$\hat\gamma_P := 2\gamma_P + \frac{2r^2 - r}{r - 1}(3 + 8\gamma_P)\beta^{1/2}, \tag{5.1b}$$

$$\bar l P_{\rm fail} \le 4\lceil (1-\alpha)\ln n/\ln r^2\rceil\, n^{-2(1-\kappa)^2\beta}. \tag{5.1c}$$

In particular, $\bar l P_{\rm fail} = o(n^{-1/2})$ if $\beta > \frac14(1-\kappa)^{-2}$. Moreover,

$$Ec_{nk} \le n + \min\{k, n-k\} + \tilde\gamma_P f(n) \quad\text{with} \tag{5.2a}$$

$$\tilde\gamma_P := \hat\gamma_P + 4\Big(\frac{2r^2 - 1}{r^2 - 1}\,2\gamma_P + 1/2\Big) n^{1/2-2(1-\kappa)^2\beta}\ln^{-1/2} n. \tag{5.2b}$$



In particular, ■yp < + I67P + 2 if [3 > \{1 



Proof. The cost of Step 2 is at most $2\gamma_P s_1$, with $s_1 \le \lceil n^{1/2}\rceil \le f(n) \le n - 1$, since $n \ge 6$. For $\bar\theta := \min\{\theta, 1-\theta\}$, the cost of Steps 4 and 5 at iteration $l$ is at most

$$\bar c_l := (1 + \bar\theta)(s_{l+1} - s_l) + 3g_l s_{l+1}/s_l + 2\gamma_P \cdot 4g_l s_{l+1}/s_l \tag{5.3}$$

with probability at least $1 - P_{\rm fail}$ by (3.6) and Cor. 3.6. Hence $c_{nk}$ exceeds

$$\bar C := 2\gamma_P s_1 + \sum_{l=1}^{\bar l} \bar c_l = 2\gamma_P s_1 + (1 + \bar\theta)(n - s_1) + (3 + 8\gamma_P)\sum_{l=1}^{\bar l} g_l s_{l+1}/s_l$$

with probability at most $\bar l P_{\rm fail}$. But $\bar C \le n + \min\{k, n-k\} + \hat\gamma_P f(n)$ by (4.10) and (5.1b), so (5.1a) follows. Then (3.7) and (4.15) with $s_1 \ge n^\alpha$ yield (5.1c).

Similarly, $Ec_{nk} \le 2\gamma_P s_1 + \sum_{l=1}^{\bar l}(Ec_l + 2\gamma_P E\hat s_l)$; bounding these costs as for (4.7)-(4.8) via (4.3), (4.10) and (4.11) gives (5.2). □

Remarks 5.2. (a) The bound (5.2) holds if Steps 2 and 5 employ a routine (e.g., Find [Hoa61]) for which the expected number of comparisons to find the $i$th smallest of $m$ elements is at most $\gamma_P m$ (then $Ec_{nk}$ is bounded as before).

(b) Suppose Step 5 returns to Step 2 if $\hat s_l \ge 4g_l s_{l+1}/s_l$. By Cor. 3.6, such loops are finite wp 1, and don't occur with high probability, for $n$ large enough.

(c) Suppose Steps 2 and 5 simply sort $S_1$ and $S_u \cup S_v$ by any algorithm that takes at most $\gamma_S m\ln m$ comparisons to sort $m$ elements for a constant $\gamma_S$. Then the cost of Step 2 is at most $\gamma_S s_1\ln n$, because $s_1 < n$; hence $\gamma_S\ln n$ may replace $2\gamma_P$ in (5.1b). Similarly, $\gamma_S\ln n$ replaces $2\gamma_P$ in (5.3) and (5.2b), and $4\gamma_S\ln n$ replaces $8\gamma_P$ in (5.1b). In other words, $n^{1/2}\ln^{3/2} n$ replaces $f(n)$ in (5.1a) and (5.2a) for suitably redefined $\hat\gamma_P$ and $\tilde\gamma_P$.
6 Experimental results

6.1 Implemented algorithms

An implementation of Select was programmed in Fortran 77 and run on a notebook PC (Pentium 4M 2 GHz, 768 MB RAM) under MS Windows XP. The input set $X$ was specified as a double precision array, and the partitioning schemes of [Kiw03b, §6] were used. For efficiency, small arrays with $n \le n_{\rm cut}$ were handled by sSelect (cf. §4.4), which typically required less than $3.5n$ comparisons. We used $n_{\rm cut} = 600$ as proposed in [FlR75a], $\alpha = 0.5$, $\beta = 0.3$ in (4.21), $r = 12$ and $\bar\eta = 2/r^2$; future work should test other parameters.

6.2 Testing examples 

As in [Kiw03b], we used minor modifications of the input sequences of [Val00]:

random     A random permutation of the integers 1 through $n$.
onezero    A random permutation of $\lceil n/2\rceil$ ones and $\lfloor n/2\rfloor$ zeros.
sorted     The integers 1 through $n$ in increasing order.
organpipe  The integers $(1, 2, \ldots, n/2, n/2, \ldots, 2, 1)$.

For each input sequence, its (lower) median element was selected for $k := \lceil n/2\rceil$. To save
space, we only add that the results for the twofaced, rotated and m3killer sequences of
[Kiw03b] were similar to those of the random, sorted and organpipe inputs, respectively.
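The four listed input sequences are easy to generate. The sketch below is an illustration only: the function name `make_input` is hypothetical, and since the text does not say how organpipe is formed for odd $n$, the sketch assumes $n$ is even for that sequence.

```python
import random


def make_input(kind, n, rng=random):
    """Generate one of the test sequences: random, onezero, sorted, organpipe."""
    if kind == "random":
        xs = list(range(1, n + 1))
        rng.shuffle(xs)
        return xs
    if kind == "onezero":
        xs = [1] * ((n + 1) // 2) + [0] * (n // 2)   # ceil(n/2) ones, floor(n/2) zeros
        rng.shuffle(xs)
        return xs
    if kind == "sorted":
        return list(range(1, n + 1))
    if kind == "organpipe":
        if n % 2:
            raise ValueError("this sketch assumes an even n for organpipe")
        half = n // 2
        return list(range(1, half + 1)) + list(range(half, 0, -1))
    raise ValueError("unknown input kind: " + kind)


def lower_median_rank(n):
    """k := ceil(n/2), the rank selected in the experiments."""
    return (n + 1) // 2
```

For the onezero input the $\lceil n/2\rceil$th smallest element is 0 when $n$ is even and 1 when $n$ is odd, so this sequence exercises the handling of nondistinct elements.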

6.3 Computational results 

We varied the input size $n$ from 50,000 to 16,000,000. For the random and onezero sequences, for each input size, 20 instances were randomly generated; for the deterministic sequences, 20 runs were made to measure the solution time.

The performance of Select is summarized in Table 6.1, where the average, maximum and minimum solution times are in milliseconds, and the comparison counts are in multiples of $n$; e.g., column six gives $C_{\rm avg}/n$, where $C_{\rm avg}$ is the average number of comparisons made over all instances. Thus $\gamma_{\rm avg} := (C_{\rm avg} - 1.5n)/f(n)$ estimates the constant $\gamma$ in the bound (4.1); moreover, for large $n$ we have $C_{\rm avg} \approx 1.5L_{\rm avg}$, where $L_{\rm avg}$ is the average sum of sizes of partitioned arrays. Further, $P_{\rm avg}$ is the average number of Select partitions, whereas $N_{\rm avg}$ is the average number of calls to sSelect and $p_{\rm avg}$ is the average number of sSelect partitions per call; both $P_{\rm avg}$ and $N_{\rm avg}$ grow slowly with $\ln n$. Finally, $S_{\rm avg}$ is the average number of sampled elements; as predicted by (4.20), $S_{\rm avg}/n$ is about $r^{-2} \approx 0.69\%$ for large $n$. The average solution times grow linearly with $n$ (except for small inputs whose solution times couldn't be measured accurately), and the differences between maximum and minimum times are quite small (and also partly due to the operating system). Except for the smallest inputs, the maximum and minimum numbers of comparisons are quite close, and $C_{\rm avg}$ nicely approaches the theoretical lower bound of $1.5n$; this is reflected in the values of $\gamma_{\rm avg}$ (which are amazingly stable). The results for the onezero inputs agree completely with our theoretical predictions.
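The estimate $\gamma_{\rm avg} := (C_{\rm avg} - 1.5n)/f(n)$ can be recomputed from the tabulated ratio $C_{\rm avg}/n$. The sketch below (function names are illustrative) does this with $f(t) = (t\ln t)^{1/2}$; because the tabulated ratios are rounded to two decimals, the recomputed values only approximate the $\gamma_{\rm avg}$ column.

```python
import math


def f(t):
    """f(t) := (t ln t)^(1/2), as in Theorem 4.1."""
    return math.sqrt(t * math.log(t))


def gamma_avg(n, c_avg_over_n):
    """Estimate of the constant gamma from the average comparison count C_avg/n."""
    return (c_avg_over_n - 1.5) * n / f(n)


if __name__ == "__main__":
    # (n, C_avg/n) pairs taken from the random-input rows for 50K and 1M
    for n, ratio in ((50_000, 1.89), (1_000_000, 1.60)):
        print(n, round(gamma_avg(n, ratio), 2))
```

The recomputed values for these two rows lie close to the tabulated 26.52 and 26.61, the residual difference being due to the rounding of $C_{\rm avg}/n$.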
Table 6.1: Performance of Select on randomly generated inputs.

Input      Size   Time [msec]      Comparisons [n]    γ_avg  L_avg  P_avg   N_avg   p_avg  S_avg
           n      avg  max  min    avg   max   min           [n]    [ln n]  [ln n]         [%n]
random     50K      2   10         1.89  2.05  1.80   26.52  1.23   0.40    0.90    5.50   1.13
random     100K     3   10         1.79  1.85  1.70   26.61  1.17   0.41    0.91    5.50   0.89
random     500K    12   20   10    1.64  1.66  1.60   26.93  1.08   0.58    1.16    5.74   0.81
random     1M      24   30   20    1.60  1.61  1.58   26.61  1.06   0.64    1.29    5.83   0.76
random     2M      44   50   40    1.57  1.58  1.56   26.96  1.04   0.68    1.41    5.81   0.73
random     4M      87   90   80    1.55  1.56  1.54   26.63  1.03   0.69    1.45    6.26   0.72
random     8M     167  171  160    1.54  1.54  1.53   25.81  1.02   0.75    1.55    5.98   0.71
random     16M    331  341  330    1.53  1.53  1.52   26.75  1.01   0.82    1.70    6.12   0.71
onezero    50K      1   11         1.50  1.50  1.50    0.01  1.00   0.18    0.14    1.10   0.86
onezero    100K     4   10         1.50  1.50  1.50    0.02  1.03   0.18    0.15    1.14   0.74
onezero    500K    15   20   10    1.50  1.50  1.50    0.00  1.00   0.16    0.15    1.18   0.72
onezero    1M      29   31   20    1.50  1.50  1.50    0.01  1.00   0.14    0.14    1.35   0.71
onezero    2M      58   61   50    1.50  1.50  1.50    0.01  1.00   0.14    0.14    1.30   0.70
onezero    4M     118  121  110    1.50  1.50  1.50    0.01  1.00   0.13    0.13    1.25   0.69
onezero    8M     234  241  230    1.50  1.50  1.50    0.01  1.00   0.13    0.13    1.25   0.69
onezero    16M    470  471  461    1.50  1.50  1.50    0.02  1.00   0.19    0.18    1.15   0.70
sorted     50K      1   10         1.89  2.22  1.75   26.45  1.26   0.41    0.91    5.97   1.15
sorted     100K     2   10         1.80  1.87  1.64   28.32  1.18   0.42    0.92    6.16   0.90
sorted     500K     8   11         1.64  1.66  1.61   26.84  1.08   0.60    1.20    6.00   0.81
sorted     1M      14   20   10    1.60  1.61  1.58   26.41  1.05   0.66    1.32    5.94   0.76
sorted     2M      26   30   20    1.58  1.59  1.57   27.96  1.04   0.68    1.41    5.89   0.73
sorted     4M      47   51   40    1.55  1.56  1.54   26.72  1.03   0.69    1.45    6.17   0.72
sorted     8M      91  100   90    1.54  1.54  1.53   25.89  1.02   0.73    1.53    6.02   0.71
sorted     16M    179  190  170    1.53  1.53  1.52   26.03  1.01   0.83    1.71    6.19   0.71
organpipe  50K                     1.90  2.18  1.81   26.85  1.24   0.40    0.89    5.17   1.15
organpipe  100K     2   10         1.78  1.88  1.71   26.20  1.17   0.41    0.90    5.82   0.89
organpipe  500K     8   10         1.64  1.67  1.61   27.19  1.08   0.58    1.16    5.85   0.81
organpipe  1M      16   20   10    1.60  1.61  1.59   26.05  1.06   0.64    1.29    5.88   0.76
organpipe  2M      31   40   30    1.57  1.58  1.55   26.99  1.04   0.67    1.40    6.08   0.73
organpipe  4M      59   61   50    1.55  1.56  1.54   25.59  1.03   0.69    1.44    6.05   0.72
organpipe  8M     116  121  110    1.54  1.54  1.53   26.63  1.02   0.71    1.49    6.23   0.71
organpipe  16M    228  240  220    1.53  1.53  1.52   25.67  1.01   0.83    1.71    5.96   0.71


For our parameters α = 0.5 and η = the test (4.17) is equivalent to n > r'*, so
(4.14) operates only for small n < = 20,736. Table 6.2 highlights the danger of choosing
Si by (4.14) alone (note that for η = 1.000001r~^, (4.17) couldn't hold, being equivalent
to n > lO^^r"^). Although savg increased quite dramatically (cf. Tab. 6.1), Cavg decreased
slightly for larger n only, γavg was less stable and the computing times grew significantly;
similar deteriorations occurred for other inputs.
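
As a rough consistency check on the γavg column, the following Python sketch recomputes γavg from the rounded Cavg entries of three rows of Table 6.2. It rests on two assumptions that are not restated in this section: that γavg = (Cavg - 1.5n)/(n ln n)^(1/2), and that "K" and "M" denote 10^3 and 10^6. The names ROWS, gamma and gamma_interval are illustrative only.

```python
import math

# (label, n, Cavg / n, tabulated gamma_avg), copied from Table 6.2.
# Assumption: "K" = 10**3 and "M" = 10**6.
ROWS = [
    ("50K", 50_000, 2.03, 35.84),
    ("1M", 1_000_000, 1.58, 20.64),
    ("16M", 16_000_000, 1.52, 18.42),
]


def gamma(n, c_per_n):
    """Excess over 1.5n comparisons, scaled by sqrt(n ln n) (assumed definition)."""
    return (c_per_n - 1.5) * n / math.sqrt(n * math.log(n))


def gamma_interval(n, c_per_n, half_step=0.005):
    """Range of gamma values compatible with Cavg/n rounded to two decimals."""
    return gamma(n, c_per_n - half_step), gamma(n, c_per_n + half_step)


for label, n, c_per_n, tabulated in ROWS:
    low, high = gamma_interval(n, c_per_n)
    print(f"{label:>4}: tabulated {tabulated:6.2f}, implied range [{low:6.2f}, {high:6.2f}]")
```

Under these assumptions each tabulated γavg falls inside the range implied by the rounded Cavg. The range widens as n grows, so for large n the γavg column resolves differences that the two-decimal Comparisons columns cannot.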

Although it is not clear how to implement the theoretical scheme of Knuth [Knu98,
Ex. 5.3.3-24], we tried to emulate it by using = 2 and (4.21) replaced for / < T by

gr.= imin{9,l-9}si\nsif\ (6.1) 
Relative to Tab. 6.1 this scheme made about 3% more comparisons for small n, but was

Table 6.2: Performance of Select with η = 1.000001/r^ on random inputs.

Input       Size    Time [msec]          Comparisons [n]     γavg   Lavg   Pavg   Navg   pavg   savg
               n    avg    max    min    avg    max    min           [n] [ln n] [ln n]          [%n]
random       50K      7     11          2.03   2.10   1.94  35.84   1.35   0.66   1.41   6.43  64.99
            100K     11     20     10   1.82   1.89   1.76  29.49   1.21   0.65   1.38   6.48  45.92
            500K     41     50     40   1.62   1.64   1.60  22.69   1.07   0.77   1.62   6.37  20.48
              1M     70     91     60   1.58   1.59   1.56  20.64   1.05   0.80   1.66   6.37  14.45
              2M    106    111    100   1.55   1.56   1.54  18.75   1.03   0.87   1.81   6.10  10.22
              4M    175    181    170   1.54   1.54   1.53  19.07   1.02   1.14   2.34   6.27   7.94
              8M    292    301    290   1.53   1.53   1.52  18.87   1.02   1.32   2.70   6.17   5.81
             16M    498    501    491   1.52   1.52   1.52  18.42   1.01   1.34   2.75   6.40   4.03



about 9.5 times slower due to the random sampling overheads (with savg between 52%
and 57%). Eliminating randomization gave the results of Table 6.3. Not surprisingly, this



Table 6.3: Performance of Select with Knuth's gap (6.1) and no randomization.

Input       Size    Time [msec]          Comparisons [n]     γavg   Lavg   Pavg   Navg   pavg
               n    avg    max    min    avg    max    min           [n] [ln n] [ln n]
random       50K      4     10          1.99   2.15   1.87  32.98   1.42   3.35   6.08   5.18
            100K      4     10          1.86   2.09   1.77  33.13   1.31   4.40   7.95   4.95
            500K     15     20     10   1.67   2.01   1.63  32.55   1.14   7.09  12.65   5.01
              1M     33     41     30   1.67   2.01   1.59  44.80   1.15   8.84  15.49   5.03
              2M     60     70     50   1.61   1.81   1.56  39.10   1.09   9.23  16.57   5.29
              4M    118    121    110   1.57   1.67   1.55  33.66   1.06  12.51  21.86   5.08
              8M    244    300    240   1.55   1.81   1.53  34.39   1.04  13.95  24.56   5.16
             16M    493    601    460   1.58   1.81   1.52  81.48   1.08  18.07  30.75   5.09
onezero       8M    297    301    290   1.50   1.50   1.50   0.09   1.00   1.45   0.19   1.15
             16M    582    591    580   1.50   1.50   1.50   0.11   1.00   1.45   0.18   1.10
sorted       50K     23     30     20  46.19  46.19  46.19    ***  39.86  216.3  366.0   5.18
            100K     56     61     50  56.16  56.16  56.16    ***  48.59  471.0  776.0   5.16
            500K    410    421    400  85.83  85.83  85.83    ***  75.16    ***    ***   5.37
              8M  13625  13690  13579    ***    ***    ***    ***  147.7    ***    ***   5.29
             16M  32095  32186  31986    ***    ***    ***    ***  175.7    ***    ***   5.42
organpipe     8M   7238   7281   7200  81.08  81.08  81.08    ***  71.59    ***    ***   5.06
             16M  16486  16564  16453  90.76  90.76  90.76    ***  80.55    ***    ***   5.18



scheme performed fairly well on the random inputs, but quite badly on the deterministic
inputs (where "***" denotes values exceeding the printout format).

Finally, comparing Tab. 6.1 with [Kiw03b, Tabs. 7.1-7.2], we add that Select was
slightly slower than its counterpart of [Kiw03b], although the numbers of comparisons
made were similar for large n. In fact, for small inputs, the ternary version of [Kiw04]
made the fewest comparisons. The experimental results of [Kiw03a, Kiw03b] suggest that
Select can compete successfully with refined implementations of quickselect.
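
For orientation, here is a minimal Python sketch of the classical Floyd-Rivest Select in the form commonly given for Algorithm 489, with its published constants 600 and 0.5. It is not one of the modified versions tested in this section: it uses binary partitioning, has no separate routine for small subfiles, and does not use the parameters discussed above. The function name fr_select and the 0-based rank convention are assumptions made for the illustration.

```python
import math
import random


def fr_select(a, k, left=0, right=None):
    """Rearrange a in place so that a[k] holds the element of 0-based rank k."""
    if right is None:
        right = len(a) - 1
    while right > left:
        if right - left > 600:
            # Recurse on a small window around k so that a[k] becomes a good pivot.
            n = right - left + 1
            m = k - left + 1
            z = math.log(n)
            s = 0.5 * math.exp(2 * z / 3)
            sign = -1 if m - n / 2 < 0 else 1
            sd = 0.5 * math.sqrt(z * s * (n - s) / n) * sign
            new_left = max(left, math.floor(k - m * s / n + sd))
            new_right = min(right, math.floor(k + (n - m) * s / n + sd))
            fr_select(a, k, new_left, new_right)
        # Partition a[left..right] about the pivot t = a[k].
        t = a[k]
        i, j = left, right
        a[left], a[k] = a[k], a[left]
        if a[right] > t:
            a[left], a[right] = a[right], a[left]
        while i < j:
            a[i], a[j] = a[j], a[i]
            i += 1
            j -= 1
            while a[i] < t:
                i += 1
            while a[j] > t:
                j -= 1
        if a[left] == t:
            a[left], a[j] = a[j], a[left]
        else:
            j += 1
            a[j], a[right] = a[right], a[j]
        # Keep only the side that still contains position k.
        if j <= k:
            left = j + 1
        if k <= j:
            right = j - 1
    return a[k]


if __name__ == "__main__":
    random.seed(0)
    values = random.sample(range(10**6), 100_001)
    median = fr_select(values, len(values) // 2)
    print("median:", median)
```

The recursive call on the window [new_left, new_right] plays the role of sampling: it places a pivot near rank k before the full partition, which is what keeps the comparison count close to 1.5n for medians in the tables above.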

Acknowledgment. I would like to thank Olgierd Hryniewicz, Roger Koenker, Ronald 